Abstract. We prove a CLT for spectra of submatrices of real symmetric and
Hermitian Wigner matrices. We show that if in the standard normalization the fourth
moment of the off-diagonal entries is GOE/GUE-like then the limiting Gaussian
process can be viewed as a collection of simply yet nontrivially correlated
two-dimensional Gaussian Free Fields.



Introduction. Gaussian global fluctuations of eigenvalues of GUE, GOE, Wigner
random matrices, and their generalizations are a well-studied subject, see e.g.
Chapter 2 of [AGZ] and Chapter 9 of [BS] as well as references therein. One would
usually concentrate on studying the spectrum of the full matrix, but it comes as 
no surprise that for large submatrices with a regular limiting behavior, the joint 
fluctuations would still be Gaussian. We prove this fact by a slight modification of 
the moment method presented in [AGZ]. 

It becomes more interesting when one looks at the limiting covariance structure. 
In what follows we assume that in the standard normalization the fourth moment 
of the off-diagonal entries of our matrices is the same as for GOE/GUE. 

The first statement is that for such a (real symmetric or Hermitian) Wigner 
matrix, the joint fluctuations of spectra of nested submatrices formed by cutting out 
top left corners are described by the two-dimensional Gaussian Free Field (GFF), 
see e.g. [S] for definitions and basic properties of GFFs. 

Although this result seems to be new, the appearance of the GFF is also not too 
surprising. Indeed, as was shown in [JN] and [OR], for GUE the eigenvalue ensemble
of nested matrices arises as a limit of random surfaces, and for random surfaces the 
relevance of the GFF is widely anticipated, see [K], [BF] for rigorous results and 
further references. One might argue however that the GFF interpretation simplifies 
the description of the covariance in the one-matrix case, cf. Proposition 3 below. 

The real novelty comes when one considers joint fluctuations for different nested 
sequences of submatrices. For each of the nested sequences the fluctuations are 
again described by the GFF. On the other hand, when different sequences have 
nontrivial and asymptotically regular intersections, these GFFs are correlated, and 
the exact form of the covariance kernel turns out to be simple. One could argue 
that it is as simple as one could hope for. 

The resulting Gaussian process unites a large family of mutually correlated 
GFFs. Even for two GFFs the resulting Gaussian process seems to be new. An efficient
description of the largest natural state space for this Gaussian process remains
an open problem. 






ALEXEI BORODIN 



It is natural to ask how universal the limiting process is. We believe that it also
arises in the world of random surfaces, although it is not a priori clear how to vary 
the nested sequence there. The answer comes from representation theory — one 
views random surfaces as originating from restricting suitable representations to 
a maximal commutative subalgebra and then one varies that subalgebra. We will 
address these models in a later publication. 

Acknowledgements. The author is very grateful to Grigori Olshanski and Ofer 
Zeitouni for valuable comments. The work was partially supported by NSF grant 
DMS-1056390.

Wigner matrices. Let $\{Z_{ij}\}_{j>i\ge 1}$ and $\{Y_i\}_{i\ge 1}$ be two families of independent
identically distributed real-valued random variables with zero mean such that for
any $k\ge 1$

$$\max\bigl(E|Z_{12}|^k,\ E|Y_1|^k\bigr)<\infty.$$

Assume also that

$$EY_1^2=2,\qquad EZ_{12}^2=1,\qquad EZ_{12}^4=3.$$

Define a (real symmetric) Wigner matrix $X$ by

$$X(i,j)=X(j,i)=\begin{cases} Z_{ij}, & i<j,\\ Y_i, & i=j.\end{cases}$$



A Hermitian variation of the same definition is as follows: Let $\{Z_{ij}\}_{j>i\ge 1}$ now be
complex-valued (i.i.d. mean zero) random variables with the same uniform bound
on all moments. Assume that

$$EY_1^2=1,\qquad E|Z_{12}|^2=1,\qquad E|Z_{12}|^4=2.$$

Define a Hermitian Wigner matrix $X$ by

$$X(i,j)=\overline{X(j,i)}=\begin{cases} Z_{ij}, & i<j,\\ Y_i, & i=j.\end{cases}$$



In the case when all the random variables $Y_i$, $Z_{ij}$ (or $Y_i$, $\Re Z_{ij}$, $\Im Z_{ij}$ in the
Hermitian case) are Gaussian, the Wigner matrix is said to belong to the Gaussian
Orthogonal Ensemble (GOE) in the real case, and Gaussian Unitary Ensemble
(GUE) in the Hermitian case.

For any finite set $B\subset\{1,2,\dots\}$ we denote by $X(B)$ the $|B|\times|B|$ submatrix of
the (real symmetric or Hermitian) Wigner matrix $X$ formed by the intersections of
the rows and columns of $X$ marked by elements of $B$. Clearly, the distribution of
$X(B)$ depends only on $|B|$.

Traditionally one encodes the real symmetric and the Hermitian cases by a
parameter $\beta$ that takes value 1 for GOE and value 2 for GUE.

The height function. Let $A=\{a_n\}_{n\ge 1}$ be an arbitrary sequence of pairwise
distinct natural numbers. The height function $H_A$ associated to $A$ and a Wigner
matrix $X$ is a random integer-valued function on $\mathbb R\times\mathbb R_{\ge 1}$ defined by

$$H_A(x,y)=\sqrt{\frac{\beta\pi}{2}}\,\bigl\{\text{the number of eigenvalues of } X(\{a_1,\dots,a_{[y]}\}) \text{ that are } \ge x\bigr\}.$$

The convenience of the constant prefactor $\sqrt{\beta\pi/2}$ will be evident shortly.
Good families of sequences. In what follows $L>0$ is a large parameter.

Let $\{A_i\}_{i\in I}$ be a family of sequences of pairwise distinct natural numbers.
Assume they all depend on $L$. Denote

$$A_i=\{a_{i,n}\}_{n\ge 1},\qquad A_{i,m}=\{a_{i,1},\dots,a_{i,m}\},\qquad i\in I,\ m\in\mathbb N.$$

We say that $\{A_i\}_{i\in I}$ is a good family if for any $i,j\in I$ and $x,y\in\mathbb R_{>0}$ there
exists a limit

$$\alpha(i,x;j,y)=\lim_{L\to\infty}\frac{|A_{i,[xL]}\cap A_{j,[yL]}|}{L}.$$

Here is an example of a good family: $I=\{1,2,3,4\}$ and

$$a_{1,n}=n,\qquad a_{2,n}=2n,\qquad a_{3,n}=2n+1,\qquad a_{4,n}=\begin{cases} n+L, & n\le L,\\ n-L, & L<n\le 2L,\\ n, & n>2L.\end{cases}$$



Note, however, that the index set I does not have to be finite. 
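
For the example above the limits $\alpha(i,x;j,y)$ can be estimated numerically by taking a large but finite $L$. The following Python sketch is only an illustration: the names `a`, `prefix` and `alpha_estimate` are introduced here and are not part of the text, and a finite value of $L$ stands in for the limit $L\to\infty$.

```python
def a(i, n, L):
    """n-th element of the i-th sequence of the example family (i = 1, 2, 3, 4)."""
    if i == 1:
        return n
    if i == 2:
        return 2 * n
    if i == 3:
        return 2 * n + 1
    if i == 4:
        if n <= L:
            return n + L
        if n <= 2 * L:
            return n - L
        return n
    raise ValueError("the example family has only four sequences")


def prefix(i, m, L):
    """The finite set A_{i,m} = {a_{i,1}, ..., a_{i,m}}."""
    return {a(i, n, L) for n in range(1, m + 1)}


def alpha_estimate(i, x, j, y, L):
    """Finite-L approximation of alpha(i, x; j, y) = lim |A_{i,[xL]} & A_{j,[yL]}| / L."""
    return len(prefix(i, int(x * L), L) & prefix(j, int(y * L), L)) / L


if __name__ == "__main__":
    L = 1000
    for (i, x, j, y) in [(1, 1.0, 2, 1.0), (2, 1.0, 3, 1.0), (1, 2.0, 4, 1.0)]:
        print((i, x, j, y), alpha_estimate(i, x, j, y, L))
```

For instance, the first $L$ elements of the first sequence share exactly the even numbers up to $L$ with the second sequence, while the second and third sequences never intersect.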

Correlated Gaussian Free Fields. Let $\{A_i\}_{i\in I}$ be a good family of sequences
as above. Take a family of copies of the upper half-plane $\mathbb H=\{z\in\mathbb C\mid \Im z>0\}$
indexed by $I$ and consider their union

$$\mathbb H(I)=\bigcup_{i\in I}\mathbb H_i.$$

Introduce a function $C:\mathbb H(I)\times\mathbb H(I)\to\mathbb R\cup\{+\infty\}$ via

$$C_{ij}(z,w)=\frac{1}{2\pi}\ln\left|\frac{\alpha(i,|z|^2;j,|w|^2)-zw}{\alpha(i,|z|^2;j,|w|^2)-z\bar w}\right|,\qquad i,j\in I,\ z\in\mathbb H_i,\ w\in\mathbb H_j,$$

where $\alpha(\,\cdot\,)$ is as above. Note that for $i=j$

$$C_{ii}(z,w)=\frac{1}{2\pi}\ln\left|\frac{\min(|z|^2,|w|^2)-zw}{\min(|z|^2,|w|^2)-z\bar w}\right|=-\frac{1}{2\pi}\ln\left|\frac{z-w}{z-\bar w}\right|$$

is the Green function for the Laplace operator on $\mathbb H$ with Dirichlet boundary
conditions.
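
The reduction of the kernel to the Green function for $i=j$ is easy to test numerically. The Python sketch below is a sanity check under stated assumptions, not part of the argument: the function names `kernel` and `green_half_plane` and the sample points are chosen here for illustration.

```python
import math


def kernel(alpha, z, w):
    """(1/2pi) ln |(alpha - z w) / (alpha - z conj(w))| for z, w in the upper half-plane."""
    return math.log(abs((alpha - z * w) / (alpha - z * w.conjugate()))) / (2 * math.pi)


def green_half_plane(z, w):
    """-(1/2pi) ln |(z - w) / (z - conj(w))|, the Dirichlet Green function on the upper half-plane."""
    return -math.log(abs((z - w) / (z - w.conjugate()))) / (2 * math.pi)


if __name__ == "__main__":
    z, w = complex(0.3, 0.7), complex(-1.1, 0.4)
    # For i = j the limit alpha equals min(|z|^2, |w|^2).
    alpha_same = min(abs(z) ** 2, abs(w) ** 2)
    print(kernel(alpha_same, z, w), green_half_plane(z, w))
```

The two printed values agree up to rounding, and `kernel` is symmetric in its two complex arguments for any fixed value of `alpha`.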

Proposition 1. For any good family of sequences as above, there exists a
generalized Gaussian process on $\mathbb H(I)$ with the covariance kernel $C(z,w)$ as above. More
exactly, for any finite family of test functions $f_m(z)\in C_0(\mathbb H_{i_m})$ and $i_1,\dots,i_M\in I$,
the covariance matrix

$$\mathrm{cov}(f_k,f_l)=\int_{\mathbb H}\int_{\mathbb H} f_k(z)f_l(w)\,C_{i_ki_l}(z,w)\,dz\,d\bar z\,dw\,d\bar w,\qquad k,l=1,\dots,M,$$

is positive-definite.

Denote the resulting generalized Gaussian process by $\mathcal G_{\{A_i\}_{i\in I}}$.
The proof of Proposition 1 will be given later.
Complex structure. Let $A$ be a sequence of pairwise distinct integers. The height
function $H_A$ is naturally defined on $\mathbb R\times\mathbb R_{\ge 1}$. Having the large parameter $L$, we
would like to scale $(x,y)\mapsto(L^{-1/2}x,L^{-1}y)$, which lands us in $\mathbb R\times\mathbb R_{>0}$.

Wigner's semicircle law implies that with $L\gg 1$, $x\sim L^{1/2}$, $y\sim L$, after rescaling
with overwhelming probability the eigenvalues (or, equivalently, the places of
growth of the height function in $x$-direction) are concentrated in the domain

$$\{(x,y)\in\mathbb R\times\mathbb R_{>0}\mid -2\sqrt{y}\le x\le 2\sqrt{y}\}.$$

Let us identify the interior of this domain with $\mathbb H$ via the map

$$\Omega:(x,y)\mapsto \frac{x}{2}+i\sqrt{y-\Bigl(\frac{x}{2}\Bigr)^2}.$$

Its inverse has the form

$$\Omega^{-1}(z)=(x(z),y(z))=(2\Re(z),|z|^2).$$

Note that this map sends the boundary of the domain to the real line. 
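
A minimal Python sketch of the map and its inverse, assuming the formulas $\Omega(x,y)=x/2+i\sqrt{y-(x/2)^2}$ and $\Omega^{-1}(z)=(2\Re z,|z|^2)$; the function names `omega` and `omega_inverse` are introduced here for illustration, and the clamp to zero only guards against rounding on the boundary.

```python
import math


def omega(x, y):
    """Map a point with -2 sqrt(y) <= x <= 2 sqrt(y) to the closed upper half-plane."""
    return complex(x / 2, math.sqrt(max(y - (x / 2) ** 2, 0.0)))


def omega_inverse(z):
    """Recover (x, y) = (2 Re z, |z|^2) from a point z of the upper half-plane."""
    return (2 * z.real, abs(z) ** 2)


if __name__ == "__main__":
    z = omega(0.6, 1.5)
    print(z, omega_inverse(z))   # an interior point and its round trip
    print(omega(4.0, 4.0))       # a boundary point x = 2 sqrt(y) lands on the real line
```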

Thanks to $\Omega$ we can now speak of the height function $H_A$ as being defined on
$\mathbb H$; we will use the notation

$$H^{\Omega}_A(z)=H_A(L^{1/2}x(z),Ly(z)),\qquad z\in\mathbb H.$$

Note that we have incorporated rescaling in this definition.

Main result. Let $X$ be a (real symmetric or Hermitian) Wigner matrix. Let
$\{A_i\}_{i\in I}$ be a good family of sequences. We argue that the collection of the
centralized random height functions

$$H^{\Omega}_{A_i}(z_i)-EH^{\Omega}_{A_i}(z_i),\qquad i\in I,\ z_i\in\mathbb H_i,$$

viewed as distributions, converges as $L\to\infty$ to the generalized Gaussian process
$\mathcal G_{\{A_i\}_{i\in I}}$.

One needs to verify the convergence on a suitable set of test functions. The exact 
statement that we prove is the following. 

Theorem 2. Pick $i\in I$, $y>0$, and $k\in\mathbb Z_{\ge 0}$. Define a moment of the random
height function by

$$M_{i,y,k}=\int_{-\infty}^{+\infty}x^k\bigl(H_{A_i}(L^{1/2}x,Ly)-EH_{A_i}(L^{1/2}x,Ly)\bigr)\,dx.$$

Then as $L\to\infty$, these moments converge, in the sense of finite dimensional
distributions, to the moments of $\mathcal G_{\{A_i\}_{i\in I}}$ defined as

$$\mathcal M_{i,y,k}=\int_{z\in\mathbb H_i,\ |z|^2=y}(x(z))^k\,\mathcal G_{\{A_i\}_{i\in I}}(z)\,\frac{dx(z)}{dz}\,dz.$$
Moments as traces. Let us rescale the variable $x=L^{-1/2}u$ in the definition of
$M_{i,y,k}$ and then integrate by parts. Since the derivative of the height function
$H_{A_i}(u,[Ly])$ in $u$ is

$$\frac{d}{du}H_{A_i}(u,[Ly])=-\sqrt{\frac{\beta\pi}{2}}\sum_{s=1}^{[Ly]}\delta(u-\lambda_s),$$

where $\{\lambda_s\}_{1\le s\le[Ly]}$ are the eigenvalues of $X(A_{i,[Ly]})$, we obtain

$$M_{i,y,k}=\frac{L^{-\frac{k+1}{2}}}{k+1}\sqrt{\frac{\beta\pi}{2}}\Bigl(\mathrm{Tr}\bigl(X(A_{i,[Ly]})^{k+1}\bigr)-E\,\mathrm{Tr}\bigl(X(A_{i,[Ly]})^{k+1}\bigr)\Bigr).$$



We can now reformulate the statement of Theorem 2 as follows. 

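Theorem 2'. Let $X$ be a Wigner matrix. Let $k_1,\dots,k_m\ge 1$ be integers, and
let $B_1,\dots,B_m$ be subsets of $\mathbb N$ dependent on the large parameter $L$ such that there
exist limits

$$b_p=\lim_{L\to\infty}\frac{|B_p|}{L}>0,\qquad c_{pq}=\lim_{L\to\infty}\frac{|B_p\cap B_q|}{L},\qquad p,q=1,\dots,m.$$

Then the $m$-dimensional random vector

$$\Bigl(L^{-\frac{k_p}{2}}\bigl(\mathrm{Tr}(X(B_p)^{k_p})-E\,\mathrm{Tr}(X(B_p)^{k_p})\bigr)\Bigr)_{p=1}^m \tag{1}$$

converges (in distribution and with all moments) to the zero mean $m$-dimensional
Gaussian random variable $(\xi_p)_{p=1}^m$ with the covariance

$$E\xi_p\xi_q=\frac{2k_pk_q}{\beta\pi}\int\!\!\int_{\substack{|z|^2=b_p,\ |w|^2=b_q\\ \Im z>0,\ \Im w>0}}(x(z))^{k_p-1}(x(w))^{k_q-1}\,\frac{1}{2\pi}\ln\left|\frac{c_{pq}-zw}{c_{pq}-z\bar w}\right|\,\frac{dx(z)}{dz}\frac{dx(w)}{dw}\,dz\,dw. \tag{2}$$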



Proof of Theorem 2'. The argument closely follows that given in Section 2.1.7
of [AGZ] in the case of one set $B_j=B$. One proves the convergence of moments,
which is sufficient to also claim the convergence in distribution for Gaussian limits.

Any joint moment of the coordinates of (1) is written as a finite combination
of contributions corresponding to suitably defined graphs that are in their turn
associated to words. The only difference between the multi-set case and the one-set case
is that one needs to keep track of the alphabets these words are built from: a
word corresponding to coordinate number $p$ of (1) would have to be built from the
alphabet that coincides with the set $B_p$. Equivalently, the corresponding graphs
will have their vertices labeled by elements of $B_p$.
Since all sizes $|B_p|$ have order $L$, and $|B_1\cup\dots\cup B_m|=O(L)$, the estimate
showing that all contributions not coming from matchings are negligible (Lemma
2.1.34 in [AGZ]) carries over without difficulty. It only remains to compute the
covariance.

For real symmetric Wigner matrices in the one-set case the limits of the variances
of the coordinates of (1) are given by (2.1.44) in [AGZ]. It reads (with $k=k_p$ for
a $p$ between 1 and $m$)

$$2k^2C_{\frac{k-1}{2}}^2+k^2C_{\frac{k}{2}}^2+\sum_{r=3}^{\infty}\frac{2k^2}{r}\Biggl(\sum_{\substack{k_i\ge 0\\ 2\sum_{i=1}^r k_i=k-r}}\ \prod_{i=1}^r C_{k_i}\Biggr)^2, \tag{3}$$

where $\{C_k\}_{k\ge 0}$ are the Catalan numbers, and we assume $C_a=0$ unless $a\in
\{0,1,2,\dots\}$. The Catalan number $C_k$ counts the number of rooted planar trees
with $k$ edges, and different terms of (3) have the following interpretation (see [AGZ]
for detailed explanations):

• The first term comes from two trees with $(k-1)/2$ edges each that hang from
a common vertex; the factor $k^2$ originates from choices of certain starting points
on each tree united with the common vertex, and the extra 2 is actually $EY_1^2$.

• The second term comes from two trees with $k/2$ edges each that are glued
along one edge. There are $k/2$ choices of this edge for each of the trees, there is an
additional $2=EZ_{12}^4-1$, and another additional 2 responsible for the choice of the
orientation of the gluing.

• The third term comes from two graphs each of which is a cycle of length $r$
with pendant trees hanging off each of the vertices of the cycle; the total number of
edges in the extra trees being $(k-r)/2$ (this must be an integer). As for the first
term, there is an extra $k^2=k\cdot k$ coming from the choice of the starting points and
also an extra 2 for the choice of the gluing orientation along the cycle.

For each of the three terms the total number of vertices in the resulting graph is
equal to $k$, and if one labels each vertex with a letter from an alphabet of cardinality
$|B|$ this would yield a factor of

$$|B|(|B|-1)\cdots(|B|-k+1)=|B|^k+O(|B|^{k-1}).$$

Normalization by $|B|^k$ yields (3).

In the general case, in order to evaluate the covariance

$$E\Bigl[\bigl(\mathrm{Tr}(X(B_p)^{k_p})-E\,\mathrm{Tr}(X(B_p)^{k_p})\bigr)\bigl(\mathrm{Tr}(X(B_q)^{k_q})-E\,\mathrm{Tr}(X(B_q)^{k_q})\bigr)\Bigr] \tag{4}$$

in the limit, we need to employ the same graph counting, except that the two graphs
being glued now correspond to different values $k_p$ and $k_q$ of $k$, and their vertices
are marked by letters of different alphabets $B_p$ and $B_q$.

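• The first term gives $2k_pk_qC_{\frac{k_p-1}{2}}C_{\frac{k_q-1}{2}}$ for the graph counting, and an extra

$$|B_p\cap B_q|\cdot(|B_p|-1)(|B_p|-2)\cdots\Bigl(|B_p|-\frac{k_p-1}{2}\Bigr)\cdot(|B_q|-1)(|B_q|-2)\cdots\Bigl(|B_q|-\frac{k_q-1}{2}\Bigr)$$

for the vertex labeling (the factor $|B_p\cap B_q|$ comes from the only common vertex).
Normalized by $L^{\frac{k_p+k_q}{2}}$ this yields

$$2k_pk_qC_{\frac{k_p-1}{2}}C_{\frac{k_q-1}{2}}\,c_{pq}\,b_p^{\frac{k_p-1}{2}}b_q^{\frac{k_q-1}{2}}.$$

• The second term has $k_pk_qC_{\frac{k_p}{2}}C_{\frac{k_q}{2}}$ from the graph counting and $c_{pq}^2\,b_p^{\frac{k_p}{2}-1}b_q^{\frac{k_q}{2}-1}$
from the label counting; a total of

$$k_pk_qC_{\frac{k_p}{2}}C_{\frac{k_q}{2}}\,c_{pq}^2\,b_p^{\frac{k_p}{2}-1}b_q^{\frac{k_q}{2}-1}.$$

• For the third term in the same way we obtain

$$\sum_{r=3}^{\infty}\frac{2k_pk_q}{r}\Biggl(\sum_{\substack{s_i\ge 0\\ 2\sum_{i=1}^r s_i=k_p-r}}\ \prod_{i=1}^r C_{s_i}\Biggr)\Biggl(\sum_{\substack{t_i\ge 0\\ 2\sum_{i=1}^r t_i=k_q-r}}\ \prod_{i=1}^r C_{t_i}\Biggr)c_{pq}^r\,b_p^{\frac{k_p-r}{2}}b_q^{\frac{k_q-r}{2}}.$$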

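Thus, the asymptotic value of the covariance (4) is

$$2k_pk_qC_{\frac{k_p-1}{2}}C_{\frac{k_q-1}{2}}\,c_{pq}\,b_p^{\frac{k_p-1}{2}}b_q^{\frac{k_q-1}{2}}+k_pk_qC_{\frac{k_p}{2}}C_{\frac{k_q}{2}}\,c_{pq}^2\,b_p^{\frac{k_p}{2}-1}b_q^{\frac{k_q}{2}-1}$$
$$+\sum_{r=3}^{\infty}\frac{2k_pk_q}{r}\Biggl(\sum_{\substack{s_i\ge 0\\ 2\sum_{i=1}^r s_i=k_p-r}}\ \prod_{i=1}^r C_{s_i}\Biggr)\Biggl(\sum_{\substack{t_i\ge 0\\ 2\sum_{i=1}^r t_i=k_q-r}}\ \prod_{i=1}^r C_{t_i}\Biggr)c_{pq}^r\,b_p^{\frac{k_p-r}{2}}b_q^{\frac{k_q-r}{2}}.$$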



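We now use the fact that for any $S=0,1,2,\dots$

$$\sum_{\substack{s_i\ge 0\\ \sum_{i=1}^r s_i=S}}\ \prod_{i=1}^r C_{s_i}=\binom{2S+r}{S}\frac{r}{2S+r},$$

see (5.70) in [GKP]. This allows us to rewrite the asymptotic covariance in terms
of binomial coefficients:

$$2\binom{k_p}{\frac{k_p-1}{2}}\binom{k_q}{\frac{k_q-1}{2}}c_{pq}\,b_p^{\frac{k_p-1}{2}}b_q^{\frac{k_q-1}{2}}+4\binom{k_p}{\frac{k_p}{2}-1}\binom{k_q}{\frac{k_q}{2}-1}c_{pq}^2\,b_p^{\frac{k_p}{2}-1}b_q^{\frac{k_q}{2}-1}$$
$$+\sum_{r=3}^{\infty}2r\binom{k_p}{\frac{k_p-r}{2}}\binom{k_q}{\frac{k_q-r}{2}}c_{pq}^r\,b_p^{\frac{k_p-r}{2}}b_q^{\frac{k_q-r}{2}}=\sum_{r\ge 1}2r\binom{k_p}{\frac{k_p-r}{2}}\binom{k_q}{\frac{k_q-r}{2}}c_{pq}^r\,b_p^{\frac{k_p-r}{2}}b_q^{\frac{k_q-r}{2}},$$

where a binomial coefficient is taken to be zero unless its lower index is a nonnegative integer.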
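
The convolution identity for Catalan numbers used above can be checked for small parameters. The Python sketch below is an illustration under the stated form of the identity, namely that the sum of $\prod_{i=1}^r C_{s_i}$ over $s_1+\dots+s_r=S$ equals $\binom{2S+r}{S}\frac{r}{2S+r}$; the function names are introduced here and the comparison is cross-multiplied so that it stays in exact integer arithmetic.

```python
from math import comb


def catalan(n):
    """n-th Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def catalan_convolution(r, S):
    """Sum over s_1 + ... + s_r = S (all s_i >= 0) of the product of C_{s_i}."""
    # coeffs[t] holds the convolution sum for the factors processed so far.
    coeffs = [1] + [0] * S
    for _ in range(r):
        coeffs = [sum(coeffs[t - s] * catalan(s) for s in range(t + 1))
                  for t in range(S + 1)]
    return coeffs[S]


def closed_form_holds(r, S):
    """Check conv * (2S + r) == binom(2S + r, S) * r without leaving the integers."""
    return catalan_convolution(r, S) * (2 * S + r) == comb(2 * S + r, S) * r


if __name__ == "__main__":
    print(all(closed_form_holds(r, S) for r in range(1, 7) for S in range(8)))
```

With $S=(k-r)/2$ the right-hand side becomes $\binom{k}{(k-r)/2}\frac{r}{k}$, which is the substitution that produces the binomial form of the covariance.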



Using the binomial theorem, we can write this expression as a double contour 
integral 



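$$\frac{2}{(2\pi i)^2}\oint\oint_{\mathrm{const}_1=|z|<|w|=\mathrm{const}_2}\Bigl(z+\frac{b_p}{z}\Bigr)^{k_p}\Bigl(w+\frac{b_q}{w}\Bigr)^{k_q}\,\frac{c_{pq}}{b_p}\,\frac{dz\,dw}{\bigl(\frac{c_{pq}}{b_p}z-w\bigr)^2}. \tag{5}$$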
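
The equality between the binomial sum and the double contour integral (5) can be tested numerically. The following Python sketch is an illustration only: it assumes that (5) has integrand $(z+b_p/z)^{k_p}(w+b_q/w)^{k_q}\frac{c_{pq}}{b_p}(\frac{c_{pq}}{b_p}z-w)^{-2}$ with prefactor $2/(2\pi i)^2$, it takes the contours to be the circles $|z|=1$ and $|w|=2$ (an arbitrary choice with $|z|<|w|$), and it uses the trapezoid rule on each circle. All function and parameter names are hypothetical.

```python
import cmath
from math import comb, pi


def binomial_sum(kp, kq, bp, bq, c):
    """Sum over r >= 1 of 2r binom(kp,(kp-r)/2) binom(kq,(kq-r)/2) c^r bp^((kp-r)/2) bq^((kq-r)/2)."""
    total = 0.0
    for r in range(1, min(kp, kq) + 1):
        if (kp - r) % 2 or (kq - r) % 2:
            continue  # binomial coefficients with non-integer lower index vanish
        total += (2 * r * comb(kp, (kp - r) // 2) * comb(kq, (kq - r) // 2)
                  * c ** r * bp ** ((kp - r) / 2) * bq ** ((kq - r) / 2))
    return total


def contour_integral(kp, kq, bp, bq, c, radius_z=1.0, radius_w=2.0, n=64):
    """Trapezoid-rule approximation of the double contour integral over two circles."""
    a = c / bp
    total = 0j
    for j in range(n):
        z = radius_z * cmath.exp(2j * pi * j / n)
        for l in range(n):
            w = radius_w * cmath.exp(2j * pi * l / n)
            integrand = (z + bp / z) ** kp * (w + bq / w) ** kq * a / (a * z - w) ** 2
            # On a circle dz = i z dtheta and dw = i w dphi.
            total += integrand * (1j * z) * (1j * w)
    total *= (2 * pi / n) ** 2
    return (2 / (2j * pi) ** 2 * total).real


if __name__ == "__main__":
    args = (3, 5, 1.0, 2.0, 0.5)
    print(binomial_sum(*args), contour_integral(*args))
```

The agreement reflects the expansion of $(\frac{c_{pq}}{b_p}z-w)^{-2}$ in powers of $z/w$ together with the binomial theorem applied to the two Laurent polynomials.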
Consider the right-hand side of (2) and assume that $|z|^2=b_p<b_q=|w|^2$.
Observe that

$$2\ln\left|\frac{c_{pq}-zw}{c_{pq}-z\bar w}\right|=\ln\Bigl(\frac{c_{pq}}{b_p}z-\bar w\Bigr)+\ln\Bigl(\frac{c_{pq}}{b_p}\bar z-w\Bigr)-\ln\Bigl(\frac{c_{pq}}{b_p}z-w\Bigr)-\ln\Bigl(\frac{c_{pq}}{b_p}\bar z-\bar w\Bigr).$$

This allows us to rewrite the right-hand side of (2) as a double contour integral
over complete circles in the form

$$-\frac{k_pk_q}{2\beta\pi^2}\oint\oint_{|z|^2=b_p,\ |w|^2=b_q}(x(z))^{k_p-1}(x(w))^{k_q-1}\ln\Bigl(\frac{c_{pq}}{b_p}z-w\Bigr)\frac{dx(z)}{dz}\frac{dx(w)}{dw}\,dz\,dw.$$

Recalling that $\beta=1$ and noting that

$$(x(z))^{k_p-1}\frac{dx(z)}{dz}=\frac{1}{k_p}\frac{d(x(z))^{k_p}}{dz},\qquad (x(w))^{k_q-1}\frac{dx(w)}{dw}=\frac{1}{k_q}\frac{d(x(w))^{k_q}}{dw},$$

we integrate by parts in $z$ and $w$ and recover (5). The proof for $b_p=b_q$ is
obtained by continuity of both sides, and to see that the needed identity holds for
$b_p>b_q$ it suffices to observe that both sides are symmetric in $p$ and $q$.

The argument in the case of Hermitian Wigner matrices is exactly the same,
except in the combinatorial part for the first term the factor 2 is missing due to the
change in $EY_1^2$, in the second term 2 is missing due to the change in $E|Z_{12}|^4$, and
in the third term 2 is missing because there is no choice in the orientation of two
$r$-cycles that are being glued together. □

Proof of Proposition 1. We need to show that for any complex numbers $\{u_k\}_{k=1}^M$

$$\sum_{k,l=1}^M u_k\bar u_l\int_{\mathbb H}\int_{\mathbb H}f_k(z)f_l(w)\,C_{i_ki_l}(z,w)\,dz\,d\bar z\,dw\,d\bar w\ \ge\ 0.$$

We can approximate the integration over the two-dimensional domains by finite
sums of one-dimensional integrals over semi-circles of the form $|z|=\mathrm{const}$. On
each semi-circle we further uniformly approximate the (continuous) integrand by
a polynomial in $\Re(z)$. Finally, for the polynomials the nonnegativity follows from
Theorem 2'. □

Chebyshev polynomials. One way to describe the limiting covariance structure 
in the one-matrix case is to show that traces of the Chebyshev polynomials of 
the matrix are asymptotically independent, see [J]. A similar effect takes place for 
submatrices as well. 

For $n=0,1,2,\dots$ let $T_n(x)$ be the $n$th degree Chebyshev polynomial of the first
kind:

$$T_n(x)=\cos(n\arccos x),\qquad T_n(\cos(x))=\cos(nx).$$

For any $a>0$, let $T_n^a(x)=T_n\bigl(\frac{x}{a}\bigr)$ be the rescaled version of $T_n$.
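
As a small illustration, the polynomials $T_n$ and their rescaled versions can be generated from the standard three-term recurrence $T_{m+1}(x)=2xT_m(x)-T_{m-1}(x)$. The Python sketch below uses function names introduced here for the purpose of the example.

```python
import math


def chebyshev_coefficients(n):
    """Coefficients [t_0, ..., t_n] of T_n(x) = sum_j t_j x^j, via T_{m+1} = 2x T_m - T_{m-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * t for t in cur]   # multiply T_m by 2x
        for idx, t in enumerate(prev):     # subtract T_{m-1}
            nxt[idx] -= t
        prev, cur = cur, nxt
    return cur


def rescaled_chebyshev(n, a, x):
    """The rescaled polynomial T_n^a(x) = T_n(x / a)."""
    return sum(t * (x / a) ** j for j, t in enumerate(chebyshev_coefficients(n)))


if __name__ == "__main__":
    print(chebyshev_coefficients(3))                       # coefficients of 4x^3 - 3x
    print(rescaled_chebyshev(5, 2.0, 2.0 * math.cos(0.7)), math.cos(3.5))
```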
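Proposition 3. In the assumptions of Theorem 2', for any $p,q=1,\dots,m$

$$\lim_{L\to\infty}E\Bigl[\Bigl(\mathrm{Tr}\bigl(T^{2\sqrt{b_pL}}_{k_p}(X(B_p))\bigr)-E\,\mathrm{Tr}\bigl(T^{2\sqrt{b_pL}}_{k_p}(X(B_p))\bigr)\Bigr)\Bigl(\mathrm{Tr}\bigl(T^{2\sqrt{b_qL}}_{k_q}(X(B_q))\bigr)-E\,\mathrm{Tr}\bigl(T^{2\sqrt{b_qL}}_{k_q}(X(B_q))\bigr)\Bigr)\Bigr]$$
$$=\delta_{k_pk_q}\,\frac{k_p}{2\beta}\Bigl(\frac{c_{pq}}{\sqrt{b_pb_q}}\Bigr)^{k_p}.$$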



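Proof. Using (5) and assuming $b_p<b_q$ we obtain

$$\lim_{L\to\infty}E\Bigl[\Bigl(\mathrm{Tr}\bigl(T^{2\sqrt{b_pL}}_{k_p}(X(B_p))\bigr)-E\,\mathrm{Tr}\bigl(T^{2\sqrt{b_pL}}_{k_p}(X(B_p))\bigr)\Bigr)\Bigl(\mathrm{Tr}\bigl(T^{2\sqrt{b_qL}}_{k_q}(X(B_q))\bigr)-E\,\mathrm{Tr}\bigl(T^{2\sqrt{b_qL}}_{k_q}(X(B_q))\bigr)\Bigr)\Bigr]$$
$$=\frac{2}{\beta(2\pi i)^2}\oint\oint_{|z|^2=b_p<b_q=|w|^2}T_{k_p}(\cos(\arg z))\,T_{k_q}(\cos(\arg w))\,\frac{c_{pq}}{b_p}\,\frac{dz\,dw}{\bigl(\frac{c_{pq}}{b_p}z-w\bigr)^2}$$
$$=\frac{1}{2\beta(2\pi i)^2}\oint\oint_{|z|^2=b_p<b_q=|w|^2}\Bigl(\Bigl(\frac{z}{\sqrt{b_p}}\Bigr)^{k_p}+\Bigl(\frac{z}{\sqrt{b_p}}\Bigr)^{-k_p}\Bigr)\Bigl(\Bigl(\frac{w}{\sqrt{b_q}}\Bigr)^{k_q}+\Bigl(\frac{w}{\sqrt{b_q}}\Bigr)^{-k_q}\Bigr)\frac{c_{pq}}{b_p}\,\frac{dz\,dw}{\bigl(\frac{c_{pq}}{b_p}z-w\bigr)^2}.$$

Writing $\bigl(\frac{c_{pq}}{b_p}z-w\bigr)^{-2}$ as a series in $z/w$ we arrive at the result. Continuity and
symmetry of both sides of the limiting relation removes the assumption $b_p<b_q$. □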
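
The statement of Proposition 3 can also be cross-checked against the binomial form of the limiting covariance for $\beta=1$. The Python sketch below is a consistency check under assumptions, not a proof: it takes the limiting covariance of $L^{-j/2}\mathrm{Tr}\,X(B_p)^j$ and $L^{-l/2}\mathrm{Tr}\,X(B_q)^l$ to be $\sum_{r\ge 1}2r\binom{j}{(j-r)/2}\binom{l}{(l-r)/2}c_{pq}^r b_p^{(j-r)/2}b_q^{(l-r)/2}$, expands $T_k$ in monomials, and uses bilinearity. All function names and the sample values of $b_p$, $b_q$, $c_{pq}$ are hypothetical.

```python
from math import comb, sqrt


def chebyshev_coefficients(n):
    """Coefficients [t_0, ..., t_n] of T_n, via T_{m+1} = 2x T_m - T_{m-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * t for t in cur]
        for idx, t in enumerate(prev):
            nxt[idx] -= t
        prev, cur = cur, nxt
    return cur


def monomial_covariance(j, l, bp, bq, c):
    """Assumed limit covariance of the rescaled traces of powers j and l (beta = 1)."""
    return sum(2 * r * comb(j, (j - r) // 2) * comb(l, (l - r) // 2)
               * c ** r * bp ** ((j - r) / 2) * bq ** ((l - r) / 2)
               for r in range(1, min(j, l) + 1)
               if (j - r) % 2 == 0 and (l - r) % 2 == 0)


def chebyshev_covariance(kp, kq, bp, bq, c):
    """Limit covariance of the traces of T_kp(X / (2 sqrt(bp L))) and T_kq(X / (2 sqrt(bq L)))."""
    tp, tq = chebyshev_coefficients(kp), chebyshev_coefficients(kq)
    return sum(tp[j] * tq[l] * monomial_covariance(j, l, bp, bq, c)
               / ((2 * sqrt(bp)) ** j * (2 * sqrt(bq)) ** l)
               for j in range(kp + 1) for l in range(kq + 1))


if __name__ == "__main__":
    bp, bq, c = 1.0, 2.0, 0.5
    for k in range(1, 5):
        print(k, chebyshev_covariance(k, k, bp, bq, c), k / 2 * (c / sqrt(bp * bq)) ** k)
```

For $\beta=1$ the diagonal values agree with $\frac{k_p}{2}\bigl(c_{pq}/\sqrt{b_pb_q}\bigr)^{k_p}$, and the off-diagonal values vanish up to rounding.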